31 research outputs found

    Incremental learning of concept drift from imbalanced data

    Learning data sampled from a nonstationary distribution has been shown to be a very challenging problem in machine learning, because the joint probability distribution between the data and classes evolves over time. Thus learners must adapt their knowledge base, including their structure or parameters, to remain strong predictors. This phenomenon of learning from an evolving data source is akin to learning how to play a game while the rules of the game are changed, and it is traditionally referred to as learning concept drift. Climate data, financial data, epidemiological data, and spam detection are examples of applications that give rise to concept drift problems. An additional challenge arises when the classes to be learned are not represented (approximately) equally in the training data, as most machine learning algorithms work well only when the class distributions are balanced. However, rare categories are commonly faced in real-world applications, which leads to skewed or imbalanced datasets. Fraud detection, rare disease diagnosis, and anomaly detection are examples of applications that feature imbalanced datasets, where data from one category are severely underrepresented. Concept drift and class imbalance are traditionally addressed separately in machine learning, yet data streams can experience both phenomena. This work introduces Learn++.NIE (nonstationary & imbalanced environments) and Learn++.CDS (concept drift with SMOTE) as two new members of the Learn++ family of incremental learning algorithms that explicitly and simultaneously address the aforementioned phenomena. The former addresses concept drift and class imbalance through modified bagging-based sampling and by replacing a class-independent error weighting mechanism, which normally favors the majority class, with a set of measures that emphasize good predictive accuracy on all classes. The latter integrates Learn++.NSE, an algorithm for concept drift, with the synthetic sampling method known as SMOTE, to cope with class imbalance. This research also includes a thorough evaluation of Learn++.CDS and Learn++.NIE on several real and synthetic datasets and on several figures of merit, showing that both algorithms are able to learn in some of the most difficult learning environments.
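The synthetic sampling idea behind SMOTE, mentioned in the abstract above, can be illustrated with a minimal sketch. This is not the Learn++.CDS implementation: the function name `smote_sketch`, its parameters, and the brute-force neighbor search are assumptions made for illustration. Each synthetic point is placed at a random position on the segment between a minority-class sample and one of its k nearest minority-class neighbors.

```python
import numpy as np

def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by interpolation.

    Hypothetical helper for illustration; needs at least two minority samples.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(minority, dtype=float)
    k = min(k, len(X) - 1)  # cannot have more neighbors than other samples

    # Brute-force pairwise Euclidean distances between minority samples.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a sample is not its own neighbor
    neighbors = np.argsort(d, axis=1)[:, :k]

    synthetic = np.empty((n_new, X.shape[1]))
    for i in range(n_new):
        a = rng.integers(len(X))            # pick a minority sample
        b = neighbors[a, rng.integers(k)]   # pick one of its k neighbors
        gap = rng.random()                  # position along the segment, in [0, 1)
        synthetic[i] = X[a] + gap * (X[b] - X[a])
    return synthetic
```

Because every synthetic point lies between two real minority samples, the new data stay inside the region already occupied by the minority class.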

    Hellinger distance based drift detection for nonstationary environments

    Most machine learning algorithms, including many online learners, assume that the data distribution to be learned is fixed. There are many real-world problems where the distribution of the data changes as a function of time. Changes in nonstationary data distributions can significantly reduce the generalization ability of the learning algorithm on new or field data, if the algorithm is not equipped to track such changes. When the stationary data distribution assumption does not hold, the learner must take appropriate actions to ensure that the new/relevant information is learned. On the other hand, data distributions do not necessarily change continuously, necessitating the ability to monitor the distribution and detect when a significant change in distribution has occurred. In this work, we propose and analyze a feature-based drift detection method using the Hellinger distance to detect gradual or abrupt changes in the distribution. Keywords: concept drift; nonstationary environments; drift detection.
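As a rough illustration of the quantity this abstract relies on, the sketch below computes the Hellinger distance between per-feature histograms of a reference batch and a new batch. It is not the paper's detector: the function names, the bin count, the 1/sqrt(2) normalization (which bounds the distance to [0, 1]), and the averaging over features are assumptions, and the thresholding step that would actually flag drift is left out.

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two histograms (normalized to sum to 1)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

def feature_hellinger(ref, new, bins=10):
    """Average per-feature Hellinger distance between two data batches.

    ref, new: arrays of shape (n_samples, n_features). Hypothetical helper.
    """
    ref = np.asarray(ref, dtype=float)
    new = np.asarray(new, dtype=float)
    dists = []
    for j in range(ref.shape[1]):
        # Use a shared bin range so both histograms are comparable.
        lo = min(ref[:, j].min(), new[:, j].min())
        hi = max(ref[:, j].max(), new[:, j].max())
        if lo == hi:
            hi = lo + 1.0  # avoid a zero-width range for constant features
        h_ref, _ = np.histogram(ref[:, j], bins=bins, range=(lo, hi))
        h_new, _ = np.histogram(new[:, j], bins=bins, range=(lo, hi))
        dists.append(hellinger(h_ref, h_new))
    return float(np.mean(dists))
```

A drift detector built on this would compare the distance between successive batches against a threshold; how that threshold is chosen is not stated in the abstract and is not modeled here.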

    Shadows Aren't So Dangerous After All: A Fast and Robust Defense Against Shadow-Based Adversarial Attacks

    Robust classification is essential in tasks like autonomous vehicle sign recognition, where the downsides of misclassification can be grave. Adversarial attacks threaten the robustness of neural network classifiers, causing them to consistently and confidently misidentify road signs. One such class of attack, shadow-based attacks, causes misidentifications by applying a natural-looking shadow to input images, resulting in road signs that appear natural to a human observer but confusing for these classifiers. Current defenses against such attacks use a simple adversarial training procedure to achieve a rather low 25% and 40% robustness on the GTSRB and LISA test sets, respectively. In this paper, we propose a robust, fast, and generalizable method, designed to defend against shadow attacks in the context of road sign recognition, that augments source images with binary adaptive threshold and edge maps. We empirically show its robustness against shadow attacks, and reformulate the problem to show its similarity to ε perturbation-based attacks. Experimental results show that our edge defense results in 78% robustness while maintaining 98% benign test accuracy on the GTSRB test set, with similar results from our threshold defense. Link to our code is in the paper. Comment: This is a draft version - our core results are reported, but additional experiments for journal submission are still being run.

    Fizzy: feature subset selection for metagenomics

    BACKGROUND: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α- & β-diversity. Feature subset selection, a sub-field of machine learning, can also provide a unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can obtain the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we used information-theoretic feature selection to understand the differences between protein family abundances that best discriminate between age groups in the human gut microbiome. RESULTS: We have developed a new Python command line tool for microbial ecologists, compatible with the widely adopted BIOM format, that implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tool's capabilities on publicly available datasets. CONCLUSIONS: We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy.

    Approximate kernel reconstruction for time-varying networks

    Most existing algorithms for modeling and analyzing molecular networks assume a static or time-invariant network topology. Such a view, however, does not capture the temporal evolution of the underlying biological process, as molecular networks are typically “re-wired” over time in response to cellular development and environmental changes. In our previous work, we formulated the inference of time-varying or dynamic networks as a tracking problem, where the target state is the ensemble of edges in the network. We used the Kalman filter to track the network topology over time. Unfortunately, the output of the Kalman filter does not reflect known properties of molecular networks, such as sparsity.

    Knowledge Distillation Under Ideal Joint Classifier Assumption

    Knowledge distillation is a powerful technique to compress large neural networks into smaller, more efficient networks. Softmax regression representation learning is a popular approach that uses a pre-trained teacher network to guide the learning of a smaller student network. While several studies have explored the effectiveness of softmax regression representation learning, the underlying mechanism that provides knowledge transfer is not well understood. This paper presents Ideal Joint Classifier Knowledge Distillation (IJCKD), a unified framework that provides a clear and comprehensive understanding of the existing knowledge distillation methods and a theoretical foundation for future research. Using mathematical techniques derived from a theory of domain adaptation, we provide a detailed analysis of the student network's error bound as a function of the teacher. Our framework enables efficient knowledge transfer between teacher and student networks and can be applied to various applications.

    Fusion methods for boosting performance of speaker identification systems

    Two important components of a speaker identification system are the feature extraction and the classification tasks. First, features must be robust to noise and they must also be able to provide discriminating information that the classifier can use to determine the speaker’s identity. Second, the classifier must take the features that have been extracted from a sentence and label them as corresponding to one of the enrolled speakers. However, sets of features may be even more beneficial than any single feature by itself. There may be information present in one feature that other features do not have. Therefore, we present an analysis of features and fusion by employing probabilistic averaging and weighted majority voting. Weighted voting requires that the weights be determined by a non-heuristic method and be robust to data with a large amount of channel distortion. Results using the King database show that both fusion methods lead to enhanced performance.
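The two fusion rules named in this abstract can be sketched generically as follows. This is an illustration under assumptions, not the paper's system: the function names are hypothetical, each classifier is assumed to output a posterior estimate over the enrolled speakers, and the weights are simply passed in, since the abstract does not say how they are derived.

```python
import numpy as np

def probabilistic_average(probs):
    """Fuse by averaging posteriors across classifiers, then taking the argmax.

    probs: array of shape (n_classifiers, n_speakers).
    """
    probs = np.asarray(probs, dtype=float)
    return int(np.argmax(probs.mean(axis=0)))

def weighted_majority_vote(probs, weights):
    """Fuse by letting each classifier cast a weighted vote for its top speaker.

    weights: one non-negative weight per classifier (assumed given).
    """
    probs = np.asarray(probs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    votes = np.zeros(probs.shape[1])
    for row, w in zip(probs, weights):
        votes[np.argmax(row)] += w  # only the top-ranked speaker receives the vote
    return int(np.argmax(votes))
```

The two rules can disagree: averaging takes each classifier's confidence into account, while voting only counts each classifier's top choice, scaled by its weight.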

    Review of Automatic Speech Recognition Methodologies

    DTFACT-14-D-00004, 692M152240001. This report highlights the crucial role of Automatic Speech Recognition (ASR) techniques in enhancing safety for air traffic control (ATC) in terminal environments. ASR techniques facilitate efficient and accurate transcription of verbal communications, reducing the likelihood of errors. The report also details the evolution of ASR technologies, which have converged on machine learning approaches, from Hidden Markov Models (HMMs) and Deep Neural Networks (DNNs) to End-to-End models. Finally, the report details the latest advancements in ASR techniques, focusing on transformer-based models that have outperformed traditional ASR approaches and achieved state-of-the-art results on ASR benchmarks.

    Understanding the Origins of Bacterial Resistance to Aminoglycosides through Molecular Dynamics Mutational Study of the Ribosomal A-Site

    Paromomycin is an aminoglycosidic antibiotic that targets the RNA of the bacterial small ribosomal subunit. It binds in the A-site, which is one of the three tRNA binding sites, and affects translational fidelity by stabilizing two adenines (A1492 and A1493) in the flipped-out state. Experiments have shown that various mutations in the A-site result in bacterial resistance to aminoglycosides. In this study, we performed multiple molecular dynamics simulations of the mutated A-site RNA fragment in explicit solvent to analyze changes in the physicochemical features of the A-site that were introduced by substitutions of specific bases. The simulations were conducted for free RNA and in complex with paromomycin. We found that the specific mutations affect the shape and dynamics of the binding cleft as well as significantly alter its electrostatic properties. The most pronounced changes were observed in the U1406C:U1495A mutant, where important hydrogen bonds between the RNA and paromomycin were disrupted. The present study aims to clarify the underlying physicochemical mechanisms of bacterial resistance to aminoglycosides due to target mutations.